Sentence Compression as a Step in Summarization or an Alternative Path in Text Shortening
Abstract
The originality of this work lies in tackling text compression with an unsupervised method based on deep linguistic analysis, without resorting to a learning corpus. This work presents a system for dependent tree pruning that preserves syntactic coherence and the main informational content, and it led to an operational software tool named COLIN. Experimental results show that our compressions reach honorable satisfaction levels, with a mean compression ratio of 38%.
Similar resources
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
Summarization with a Joint Model for Sentence Extraction and Compression
Text summarization is one of the oldest problems in natural language processing. Popular approaches rely on extracting relevant sentences from the original documents. As a side effect, sentences that are too long but partly relevant are doomed to either not appear in the final summary, or prevent inclusion of other relevant sentences. Sentence compression is a recent framework that aims to sele...
NAACL HLT 2009 Integer Linear Programming for Natural Language Processing
Text summarization is one of the oldest problems in natural language processing. Popular approaches rely on extracting relevant sentences from the original documents. As a side effect, sentences that are too long but partly relevant are doomed to either not appear in the final summary, or prevent inclusion of other relevant sentences. Sentence compression is a recent framework that aims to sele...
Improving Automatic Summarization of Persian Texts Using Natural Language Processing Methods and a Similarity Graph
A significant amount of available information is stored in textual databases, which contain large collections of documents from different sources (such as news, articles, books, emails and web pages). The increasing visibility and importance of this class of information motivates us to work on better automatic evaluation tools for textual resources. The automatic summarization of tex...
Automatic Text Summarization Using Two-Step Sentence Extraction
Automatic text summarization aims to reduce the size of a document while preserving its content. Our summarization system is based on Two-step Sentence Extraction. Because it combines statistical methods and efficiently reduces noisy data through two steps, it can achieve high performance. In our experiments for 30% compression and 10% compression, our method is compared with Title, Locat...